fix(bigquery/storage/managedwriter): append improvements #5465

shollyman · 2022-02-07T18:45:01Z

This PR addresses two issues. The receiver channel for processing
asynchronous updates is switched to a buffered channel, based on the
allowed append depth.

The second change is that this allows for better context
expiry/cancellation when invoking AppendRows on a managed stream.

This also improves testing with some test refactors, as well as
shaking out some timing issues due to the larger queue depth.

Fixes: googleapis#5459
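For readers skimming the thread, here is a minimal sketch of what sizing the channel by append depth buys. It is an illustration only, not the code in this PR: `pendingWrite`, `newPendingQueue`, and `enqueueWithoutReceiver` are hypothetical names, and the depth of 3 is an arbitrary value.

```go
package main

import "fmt"

// pendingWrite is a hypothetical stand-in for a queued append request.
type pendingWrite struct{ id int }

// newPendingQueue returns a channel that can hold up to depth writes
// before a sender has to wait for the receiver.
func newPendingQueue(depth int) chan *pendingWrite {
	return make(chan *pendingWrite, depth)
}

// enqueueWithoutReceiver counts how many of n writes can be queued while
// nothing is draining the channel.
func enqueueWithoutReceiver(ch chan *pendingWrite, n int) int {
	accepted := 0
	for i := 0; i < n; i++ {
		select {
		case ch <- &pendingWrite{id: i}:
			accepted++
		default:
			// The channel is full (or unbuffered with no receiver ready).
			return accepted
		}
	}
	return accepted
}

func main() {
	fmt.Println(enqueueWithoutReceiver(newPendingQueue(0), 5)) // unbuffered
	fmt.Println(enqueueWithoutReceiver(newPendingQueue(3), 5)) // depth of 3
}
```

With an unbuffered channel every send has to wait for the receiver, while the buffered one absorbs up to the configured depth before a sender blocks.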

codyoss · 2022-02-07T20:37:31Z

bigquery/storage/managedwriter/managed_stream.go

-		return nil, err
+	// Call the underlying append.  The stream has it's own retained context and will surface expiry on
+	// it's own, but we also need to respect any deadline for the provided context.
+	errCh := make(chan error)


nit: Not sure if you import or use it already, but this looks like a great use case for errgroup.

This particular use case is handling a single request on a bidi stream and supporting cancellation, but it's certainly something we may adopt if we end up with write multiplexing.

jsabbatini-upguard · 2022-02-07T21:11:39Z

bigquery/storage/managedwriter/managed_stream.go

+	// it's own, but we also need to respect any deadline for the provided context.
+	errCh := make(chan error)
+	var appendErr error
+	go func() {


There is the potential here that append will block waiting to add a pending write to the pending writes channel. When that happens, the context below could expire, returning an error from AppendRows. However, this goroutine is still running, and as soon as the pending channel is freed the append will in fact succeed even though an error was returned.

Furthermore, this goroutine not being stopped when the caller context expires means that if append becomes slow due to a full pending write channel, rapid calls to AppendRows, even with short timeouts, could result in a very large number of goroutines left running.
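A small, self-contained reproduction of the scenario described above may help. This is a simplified sketch, not the managedwriter code: `appendRows` and `lateWrite` are hypothetical, and an `int` channel stands in for the pending writes channel.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// appendRows is a hypothetical, simplified version of the pattern under
// review: the blocking send runs in a goroutine while the caller waits on ctx.
func appendRows(ctx context.Context, pending chan int, v int) error {
	errCh := make(chan error, 1)
	go func() {
		pending <- v // blocks for as long as the queue is full
		errCh <- nil
	}()
	select {
	case err := <-errCh:
		return err
	case <-ctx.Done():
		return ctx.Err() // the goroutine above is still running
	}
}

// lateWrite returns the value that reaches the queue after the caller has
// already been handed an error, together with that error.
func lateWrite() (int, error) {
	pending := make(chan int, 1)
	pending <- 1 // fill the queue so the next send blocks

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	err := appendRows(ctx, pending, 2)

	<-pending             // free a slot; the stranded goroutine can now send
	return <-pending, err // the "failed" write arrives anyway
}

func main() {
	v, err := lateWrite()
	fmt.Println(v, err)
}
```

The caller sees a deadline error, yet the value is still delivered once the queue drains. Each timed-out call also leaves one goroutine parked on the send until then.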

Good point. I was reasoning through backend behaviors, didn't think through this scenario.

Spent some time refactoring this, as there were other correctness issues as well, primarily around potential races.

Also added a test explicitly to watch for goroutine leaks.
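One way such a leak check can be structured is sketched below. This is not the test added in this PR: `leakedGoroutines`, `clean`, and `leaky` are hypothetical helpers, and the 200ms settle window is an arbitrary choice.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakedGoroutines runs fn and reports how many goroutines beyond the
// starting count are still alive after a short settle window.
func leakedGoroutines(fn func()) int {
	const settle = 200 * time.Millisecond
	before := runtime.NumGoroutine()
	fn()
	deadline := time.Now().Add(settle)
	for time.Now().Before(deadline) {
		if runtime.NumGoroutine() <= before {
			return 0
		}
		time.Sleep(time.Millisecond)
	}
	if extra := runtime.NumGoroutine() - before; extra > 0 {
		return extra
	}
	return 0
}

// clean starts goroutines and releases them before returning.
func clean() {
	done := make(chan struct{})
	for i := 0; i < 5; i++ {
		go func() { <-done }()
	}
	close(done)
}

// leaky starts goroutines that wait on a channel nobody ever closes.
func leaky() {
	block := make(chan struct{})
	for i := 0; i < 5; i++ {
		go func() { <-block }()
	}
}

func main() {
	fmt.Println("clean:", leakedGoroutines(clean))
	fmt.Println("leaky:", leakedGoroutines(leaky))
}
```

Comparing `runtime.NumGoroutine()` before and after is coarse, since unrelated background goroutines can skew the count, so a real test would likely want some tolerance or retries.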

jsabbatini-upguard · 2022-02-10T01:35:38Z

bigquery/storage/managedwriter/managed_stream.go

 		}
 		if err == nil {
+			// Compute numRows, once we pass ownership to the channel the request may be
+			// cleared.
+			numRows := int64(len(pw.request.GetProtoRows().Rows.GetSerializedRows()))
 			ch <- pw


Should this guard against the otherCtx expiring as well?

Something like

select {
case ch <- pw:
	// We've passed ownership of the pending write to the channel.
	// It's now responsible for marking the request done, we're done
	// with the critical section.
	ms.mu.Unlock()
	// Record stats and return.
	recordStat(ms.ctx, AppendRequests, 1)
	recordStat(ms.ctx, AppendRequestBytes, int64(pw.reqSize))
	recordStat(ms.ctx, AppendRequestRows, numRows)
	return nil
case <-otherCtx.Done():
	ms.mu.Unlock()
	return otherCtx.Err()
}

That way the call to append is guaranteed not to block if ch is full, and it is guaranteed to return within the given context timeout (within some margins).

And once calls to append are guaranteed to return within the timeout the goroutine in AppendRows could probably be removed and replaced with a synchronous call I believe.

The issue is that the select allows the writer to get out of sync processing append responses, which arrive in the same order as append requests were accepted. We use the channel for maintaining the ordering when processing the responses.
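To make the ordering concern concrete, here is a rough sketch under one assumption: that a request can already be on the wire when its pending write is dropped from the queue. `matchResponses` and the ids are hypothetical, and this is not the library's response processor.

```go
package main

import "fmt"

// pendingWrite is a hypothetical stand-in for a queued append request.
type pendingWrite struct{ id int }

// matchResponses pairs each response with the oldest queued write, relying
// on the channel's FIFO order. It stops once the queue is exhausted.
func matchResponses(queued []int, responses []string) map[int]string {
	pending := make(chan *pendingWrite, len(queued))
	for _, id := range queued {
		pending <- &pendingWrite{id: id}
	}
	close(pending)

	got := make(map[int]string)
	for _, r := range responses {
		pw, ok := <-pending
		if !ok {
			break // more responses than tracked writes
		}
		got[pw.id] = r
	}
	return got
}

func main() {
	// Three requests were sent, so three responses arrive in order.
	responses := []string{"resp-1", "resp-2", "resp-3"}

	// All three writes were queued: each gets its own response.
	fmt.Println(matchResponses([]int{1, 2, 3}, responses))

	// Write 2 was sent but never queued: write 3 is handed resp-2.
	fmt.Println(matchResponses([]int{1, 3}, responses))
}
```

In the second case every write after the skipped one is paired with the wrong response, which is the out-of-sync condition described above.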

Understood. Thanks.

shollyman added 2 commits February 7, 2022 18:36

Merge branch 'main' into deadlock

ed9f7ad

shollyman requested a review from a team February 7, 2022 18:45

shollyman requested a review from a team as a code owner February 7, 2022 18:45

shollyman requested a review from steffnay February 7, 2022 18:45

product-auto-label bot added the size: s and api: bigquery labels Feb 7, 2022

shollyman changed the title ~~feat(bigquery/storage/managedwriter): append improvements~~ fix(bigquery/storage/managedwriter): append improvements Feb 7, 2022

shollyman requested review from tswast and codyoss February 7, 2022 18:52

codyoss reviewed Feb 7, 2022

jsabbatini-upguard reviewed Feb 7, 2022

shollyman added 2 commits February 10, 2022 00:06

revamp PR

4d3b7ba

Merge branch 'main' into deadlock

87cd1f3

product-auto-label bot added the size: m label and removed the size: s label Feb 10, 2022

jsabbatini-upguard reviewed Feb 10, 2022

codyoss reviewed Feb 10, 2022

shollyman added 3 commits February 10, 2022 17:46

improve comments

c7dd310

Merge branch 'main' into deadlock

0dd2a20

minor rename

01bbdd9

codyoss approved these changes Feb 10, 2022

tswast self-assigned this Feb 14, 2022

tswast reviewed Feb 14, 2022

tswast removed their assignment Feb 14, 2022

shollyman added 4 commits February 14, 2022 17:38

Merge branch 'main' into deadlock

887858a

address some comment issues

1bf8e94

add more details about channel usage

96fa421

Merge branch 'main' into deadlock

e944282

tswast approved these changes Feb 14, 2022

Merge branch 'main' into deadlock

6026ab1

shollyman enabled auto-merge (squash) February 14, 2022 21:25

shollyman merged commit aa167bd into googleapis:main Feb 14, 2022

release-please bot mentioned this pull request Feb 14, 2022

chore(main): release bigquery 1.28.0 #5490

Merged

shollyman deleted the deadlock branch February 14, 2022 22:00
